Adaptive allocations improvements #126307

jan-elastic · 2025-04-04T14:12:58Z

Various small fixes.

elasticsearchmachine · 2025-04-04T14:14:09Z

Pinging @elastic/ml-core (Team:ML)

jan-elastic · 2025-04-04T14:14:48Z

...c/main/java/org/elasticsearch/xpack/core/ml/inference/assignment/TrainedModelAssignment.java

     * @param reason may contain a human-readable explanation for the current state
     * @param startTime the time when the assignment was created
-     * @param maxAssignedAllocations used for adaptive allocations
+     * @param maxAssignedAllocations keeps track of the maximum number of allocations used for this assignment


I don't know exactly what this is used for. It looks like it is only recorded, not used for decision making. It is definitely not used by adaptive allocations.

jan-elastic · 2025-04-04T14:15:36Z

...g/elasticsearch/xpack/ml/inference/adaptiveallocations/AdaptiveAllocationsScalerService.java

                }
+                if (assignmentStates.get(deploymentId) != AssignmentState.STARTED) {
+                    logger.debug(
+                        "adaptive allocations scaler: skipping scaling [{}] because it is in [{}] state.",


Updating a model that's not in the STARTED state leads to errors in the logs.
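For illustration, a minimal self-contained sketch of this guard. The `AssignmentState` enum below is a stand-in declared locally, and `deploymentsToScale` is a hypothetical helper; neither is the actual Elasticsearch code, and the real service logs through `logger.debug` rather than printing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ScalerGuardSketch {
    // Stand-in for the real AssignmentState enum (assumption: declared here only for the sketch).
    enum AssignmentState { STARTING, STARTED, STOPPING, FAILED }

    // Hypothetical helper: returns the deployment IDs that may be scaled,
    // i.e. only those whose assignment is in the STARTED state.
    static List<String> deploymentsToScale(Map<String, AssignmentState> assignmentStates) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, AssignmentState> entry : assignmentStates.entrySet()) {
            if (entry.getValue() != AssignmentState.STARTED) {
                // Skip instead of sending an update that would fail and pollute the logs.
                System.out.printf(
                    "adaptive allocations scaler: skipping scaling [%s] because it is in [%s] state.%n",
                    entry.getKey(),
                    entry.getValue()
                );
                continue;
            }
            result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, AssignmentState> states = new LinkedHashMap<>();
        states.put("model-a", AssignmentState.STARTED);
        states.put("model-b", AssignmentState.STOPPING);
        System.out.println(deploymentsToScale(states)); // prints [model-a]
    }
}
```

Only `model-a` is returned; `model-b` is skipped with a debug-style message instead of being updated.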

jan-elastic · 2025-04-04T14:16:41Z

...rc/main/java/org/elasticsearch/xpack/ml/inference/assignment/planning/AssignmentPlanner.java

        this.nodes = nodes.stream().sorted(Comparator.comparing(Node::id)).toList();
-        this.deployments = deployments.stream().sorted(Comparator.comparing(AssignmentPlan.Deployment::deploymentId)).toList();
+        this.deployments = deployments.stream()
+            .filter(deployment -> deployment.allocations() > 0)


If these aren't filtered, computePlan will "plan with at least one allocation for previously assigned models", so a deployment that requests zero allocations would still get one.
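A rough standalone sketch of the filter-then-sort step from the diff above. The `Deployment` record is a simplified stand-in for `AssignmentPlan.Deployment` with only the two fields used here, and `plannable` is a hypothetical name for illustration.

```java
import java.util.Comparator;
import java.util.List;

public class PlannerFilterSketch {
    // Simplified stand-in for AssignmentPlan.Deployment (assumption: only these two fields matter here).
    record Deployment(String deploymentId, int allocations) {}

    // Hypothetical helper: drop deployments that request zero allocations,
    // then sort by ID so the planner input has a deterministic order.
    static List<Deployment> plannable(List<Deployment> deployments) {
        return deployments.stream()
            .filter(deployment -> deployment.allocations() > 0)
            .sorted(Comparator.comparing(Deployment::deploymentId))
            .toList();
    }

    public static void main(String[] args) {
        List<Deployment> deployments = List.of(
            new Deployment("model-c", 1),
            new Deployment("model-b", 0), // scaled down to zero, must not be planned
            new Deployment("model-a", 2)
        );
        List<String> ids = plannable(deployments).stream().map(Deployment::deploymentId).toList();
        System.out.println(ids); // prints [model-a, model-c]
    }
}
```

With the zero-allocation deployment removed before planning, it can no longer be handed an allocation it did not ask for.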

elasticsearchmachine · 2025-04-07T06:48:31Z

💔 Backport failed

Status	Branch	Result
❌	8.18	Commit could not be cherrypicked due to conflicts
❌	8.x	Commit could not be cherrypicked due to conflicts
✅	9.0
❌	8.17	Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 126307

* Adaptive allocations: don't update deployments that aren't started.
* AssignmentPlanner: don't plan deployments with zero allocations
* Update JavaDoc

jan-elastic added 3 commits April 4, 2025 16:11

Adaptive allocations: don't update deployments that aren't started.

e430d32

AssignmentPlanner: don't plan deployments with zero allocations

8bcb917

Update JavaDoc

b43d2e6

elasticsearchmachine added the needs:triage and v9.1.0 labels Apr 4, 2025

jan-elastic added the :ml, Team:ML, auto-backport, v8.18.1, v8.19.0, v9.0.1, v8.17.5 and >non-issue labels and removed the needs:triage label Apr 4, 2025

jan-elastic requested a review from jonathan-buttner April 4, 2025 14:13

jan-elastic commented Apr 4, 2025

jan-elastic assigned davidkyle and jonathan-buttner Apr 4, 2025

jonathan-buttner approved these changes Apr 4, 2025

prwhelan approved these changes Apr 4, 2025

jan-elastic merged commit 24909ca into main Apr 7, 2025
18 checks passed

jan-elastic deleted the adaptive-allocations-improvements branch April 7, 2025 06:46

jan-elastic mentioned this pull request Apr 7, 2025

[9.0] Adaptive allocations improvements (#126307) #126379

Merged

elasticsearchmachine added the backport pending label Apr 7, 2025
